    Approximating Clustering of Fingerprint Vectors with Missing Values

    The problem of clustering fingerprint vectors is an interesting problem in Computational Biology that was proposed in (Figueroa et al. 2004). In this paper we show some improvements in closing the gaps between the known lower and upper bounds on the approximability of some variants of the biological problem. Namely, we prove that the problem is APX-hard even when each fingerprint contains only two unknown positions. Moreover, we study some variants of the original problem and give two 2-approximation algorithms for the IECMV and OECMV problems when the number of unknown entries in each vector is at most a constant. Comment: 13 pages, 4 figures

    Origin of rat β-globin haplotypes containing three and five genes

    We have reported three adult β-gene haplotypes in rat, containing either five or three genes. Detailed sequence analysis reveals that the leftmost gene is the major gene and that the minor gene lies downstream, at the opposite end. All of the genes lying between them are minor-major hybrids, indicating their origin by unequal crossing-over. In two haplotypes, β-globin genes were found with an L1 element inserted directly into IVS2. The described results allow the formulation of a pathway of mutational events leading from the ancient two-β-gene rodent ancestor through a three-gene haplotype to five-gene haplotypes, one of which is postulated to have arisen in common laboratory strains since their capture in the wild. [https://academic.oup.com/mbe/article/7/5/407/1061225]

    Significant abundance of cis configurations of coding variants in diploid human genomes

    To fully understand human genetic variation and its functional consequences, the specific distribution of variants between the two chromosomal homologues of genes must be known. The 'phase' of variants can significantly impact gene function and phenotype. To assess patterns of phase at large scale, we have analyzed 18 121 autosomal genes in 1092 statistically phased genomes from the 1000 Genomes Project and 184 experimentally phased genomes from the Personal Genome Project. Here we show that genes with cis-configurations of coding variants are more frequent than genes with trans-configurations in a genome, with global cis/trans ratios of ∼60:40. Significant cis-abundance was observed in virtually all genomes in all populations. Moreover, we identified a large group of genes exhibiting cis-configurations of protein-changing variants in excess, so-called 'cis-abundant genes', and a smaller group of 'trans-abundant genes'. These two gene categories were functionally distinguishable, and exhibited strikingly different distributional patterns of protein-changing variants. Underlying these phenomena was a shared set of phase-sensitive genes of importance for adaptation and evolution. This work establishes common patterns of phase as key characteristics of diploid human exomes and provides evidence for their functional significance, highlighting the importance of phase for the interpretation of protein-coding genetic variation and gene function.
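    As an illustration of what a cis/trans ratio measures, the following is a minimal sketch and not the authors' pipeline. It assumes that a gene counts as "cis" when all of its heterozygous coding variants sit on the same homologue and as "trans" when they are split across both, and that genes with fewer than two such variants are skipped. The function names and the toy gene labels are hypothetical.

```python
from typing import Dict, List, Optional


def configuration(homologues: List[int]) -> Optional[str]:
    """Classify one gene from its phased heterozygous variants.

    `homologues` holds, per variant, 0 or 1 for the homologue that
    carries the alternate allele (assumed encoding, for illustration).
    """
    if len(homologues) < 2:
        return None  # phase is undefined with fewer than two variants
    return "cis" if len(set(homologues)) == 1 else "trans"


def cis_fraction(genes: Dict[str, List[int]]) -> float:
    """Fraction of classifiable genes that are in cis configuration."""
    labels = [configuration(h) for h in genes.values()]
    labels = [label for label in labels if label is not None]
    if not labels:
        raise ValueError("no gene has two or more phased variants")
    return labels.count("cis") / len(labels)


# Toy genome with hypothetical gene labels.
toy_genome = {
    "GENE_A": [0, 0],     # both variants on homologue 0: cis
    "GENE_B": [1, 1, 1],  # all on homologue 1: cis
    "GENE_C": [0, 1],     # split across homologues: trans
    "GENE_D": [1],        # single variant: skipped
    "GENE_E": [0, 0],     # cis
}
print(cis_fraction(toy_genome))
```

    In this toy genome three of the four classifiable genes are cis, so the fraction is 0.75; a global ratio of about 60:40 corresponds to a fraction near 0.6.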

    Untreated PKU patients without intellectual disability: SHANK gene family as a candidate modifier

    Phenylketonuria (PKU) is an inborn error of metabolism caused by variants in the phenylalanine hydroxylase (PAH) gene and is characterized by excessively high levels of phenylalanine in body fluids. PKU is a paradigm for a genetic disease that can be treated, and the majority of developed countries have population-based newborn screening. Thus, the combination of early diagnosis and immediate initiation of treatment has resulted in normal intelligence for treated PKU patients. Although PKU is a monogenic disease, decades of research and clinical practice have shown that the correlation between the genotype and corresponding phenotype is not simple at all. Attempts have been made to discover modifier genes for the PKU cognitive phenotype, but without any success so far. We conducted whole genome sequencing of 4 subjects from unrelated non-consanguineous families who presented with pathogenic mutations in the PAH gene, high blood phenylalanine concentrations and near-normal cognitive development despite no treatment. We used cross-sample analysis to select genes common to more than one patient. Thus, the SHANK gene family emerged as the only relevant gene family with variants detected in 3 of 4 analyzed patients. We detected two novel variants, p.Pro1591Ala in SHANK1 and p.Asp18Asn in SHANK2, as well as SHANK2:p.Gly46Ser, SHANK2:p.Pro1388_Phe1389insLeuPro and SHANK3:p.Pro1716Thr variants that were previously described. Computational analysis indicated that the identified variants do not abolish the function of SHANK proteins. However, changes in posttranslational modifications of SHANK proteins could influence functioning of the glutamatergic synapses and cytoskeleton regulation, and contribute to maintaining optimal synaptic density and number of dendritic spines. Our findings link the SHANK gene family and brain plasticity in PKU for the first time. We hypothesize that variant SHANK proteins maintain optimal synaptic density and number of dendritic spines under high concentrations of phenylalanine and could have a protective modifying effect on cognitive development of PKU patients.

    Sequencing by Hybridization of Long Targets

    Sequencing by Hybridization (SBH) reconstructs an n-long target DNA sequence from its biochemically determined l-long subsequences. In the standard approach, the length of a uniformly random sequence that can be unambiguously reconstructed is limited, because repetitive subsequences cause reconstruction degeneracies. We present a modified sequencing method that overcomes this limitation without the need for different types of biochemical assays and is robust to error.
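    The degeneracy of standard SBH can be seen on a toy example. The sketch below is illustrative only: it implements a brute-force version of the standard approach, not the modified method of the paper, and it assumes an error-free spectrum, a known first l-mer and a known target length n. The function names and example sequences are hypothetical.

```python
from typing import List, Set


def spectrum(target: str, l: int) -> Set[str]:
    """Set of all l-long substrings of the target (the SBH spectrum)."""
    return {target[i:i + l] for i in range(len(target) - l + 1)}


def reconstructions(spec: Set[str], start: str, n: int) -> List[str]:
    """All n-long sequences starting with `start` whose spectrum equals `spec`.

    Brute-force depth-first extension, one base at a time; only
    practical for very short toy targets. Requires len(start) >= 2.
    """
    l = len(start)
    found: List[str] = []

    def extend(seq: str) -> None:
        if len(seq) == n:
            if spectrum(seq, l) == spec:
                found.append(seq)
            return
        for base in "ACGT":
            # The new l-mer is the last l-1 bases plus the candidate base.
            if seq[-(l - 1):] + base in spec:
                extend(seq + base)

    extend(start)
    return sorted(found)


# No repeated 2-mer: the 3-mer spectrum determines the target uniquely.
unique = reconstructions(spectrum("ACGTTGCA", 3), "ACG", 8)

# "TG" occurs three times: two different sequences share one spectrum.
ambiguous = reconstructions(spectrum("CTGATGGTGC", 3), "CTG", 10)

print(unique)
print(ambiguous)
```

    In the second example the segments between the repeated "TG" can be swapped without changing the set of 3-mers, which is the kind of reconstruction degeneracy the abstract refers to.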

    Identification of FVIII gene mutations in patients with hemophilia A using new combinatorial sequencing by hybridization

    Background: Standard methods of mutation detection in Hemophilia A (HA) are time consuming, rendering their application unavailable in some analyses such as prenatal diagnosis. Objectives: To evaluate the feasibility of combinatorial sequencing-by-hybridization (cSBH) as an alternative and reliable tool for mutation detection in the FVIII gene. Patients/Methods: We have applied a new method of cSBH that uses two different colors for detection of multiple point mutations in the FVIII gene. The 26 exons encompassing the HA gene were analyzed in 7 newly diagnosed Italian patients and in 19 previously characterized individuals with FVIII deficiency. Results: Data show that, when solution-phase TAMRA and QUASAR labeled 5-mer oligonucleotide sets mixed with unlabeled target PCR templates are co-hybridized in the presence of DNA ligase to universal 6-mer oligonucleotide probe-based arrays, a number of mutations can be successfully detected. The technique was also reliable in identifying a mutant FVIII allele in an obligate heterozygote. A novel missense mutation (Leu1843Thr) in exon 16 and three novel neutral polymorphisms are presented with an updated protocol for 2-color cSBH. Conclusions: cSBH is a reliable tool for mutation detection in the FVIII gene and may represent a complementary method for the genetic screening of HA patients.

    Identification of cancer predisposition variants in apparently healthy individuals using a next-generation sequencing-based family genomics approach

    Cancer, like many common disorders, has a complex etiology, often with a strong genetic component and with multiple environmental factors contributing to susceptibility. A considerable number of genomic variants have been previously reported to be causative of, or associated with, an increased risk for various types of cancer. Here, we adopted a next-generation sequencing approach in 11 members of two families of Greek descent to identify all genomic variants with the potential to predispose family members to cancer. Cross-comparison with data from the Human Gene Mutation Database identified a total of 571 variants, of which 47 % were disease-associated polymorphisms, 26 % disease-associated polymorphisms with additional supporting functional evidence, 19 % functional polymorphisms with in vitro/laboratory or in vivo supporting evidence but no known disease association, 4 % putative disease-causing mutations but with some residual doubt as to their pathological significance, and 3 % disease-causing mutations. Subsequent analysis, focused on the latter variant class most likely to be involved in cancer predisposition, revealed two variants of prime interest, namely MSH2 c.2732T>A (p.L911R) and BRCA1 c.2955delC, the first of which is novel. KMT2D c.13895delC and c.1940C>A variants are additionally reported as incidental findings. The next-generation sequencing-based family genomics approach described herein has the potential to be applied to other types of complex genetic disorder in order to identify variants of potential pathological significance.

    Routes for breaching and protecting genetic privacy

    We are entering the era of ubiquitous genetic information for research, clinical care, and personal curiosity. Sharing these datasets is vital for rapid progress in understanding the genetic basis of human diseases. However, one growing concern is the ability to protect the genetic privacy of the data originators. Here, we technically map threats to genetic privacy and discuss potential mitigation strategies for privacy-preserving dissemination of genetic data. Comment: Draft for comment